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Abstract — Swarm Intelligence (SI) is a relatively new paradigm 
being applied in a host of research settings to improve the 
management and control of large numbers of interacting entities 
such as communication, computer and sensor networks, satellite 
constellations and more. Attempts to take advantage of this 
paradigm and mimic the behavior of insect swarms however often 
lead to many different implementations of SI. The rather vague 
notions of what constitutes self-organized behavior lead to rather 
ad hoc approaches that make it difficult to ascertain just what SI 
is, assess its true potential and more fully take advantage of it. 
This article provides a set of general principles for SI research 
and development. A precise definition of self-organized behavior 
is described and provides the basis for a more axiomatic and 
logical approach to research and development as opposed to the 
more prevalent ad hoc approach in using SI concepts. 

The concept of Pareto optimality is utilized to capture the 
notions of efficiency and adaptability. A new concept, Scale 
Invariant Pareto Optimality is described and entails symmetry 
relationships and scale invariance where Pareto optimality is 
preserved under changes in system states. This provides a 
mathematical way to describe efficient tradeoffs of efficiency 
between different scales and further, mathematically captures 
the notion of the graceful degradation of performance so often 
sought in complex systems. 

Index Terms — swarm intelligence, self -organization, multiob- 
jective optimization, Pareto optima, finite-state machines 

I. Introduction 

TODAY'S communications networks have become enor- 
mously complex systems. New technologies from sen- 
sor networks, web-enabled PDAs, remote surgery systems 
to constellations of orbiting satellites all require enormous 
numbers of communicating and interacting entities. These 
entities must work together harmoniously to be effective. As 
the numbers of these interacting entities increases, ensuring 
their efficient operation becomes increasingly difficult. Indeed, 
for the past three decades this growth in general has approx- 
imately doubled every 18 months [1, p. 32]. New paradigms 
of modern warfare also indicate an accelerated growth in the 
numbers of interacting systems. Amidst this growth, there 
is a growing consensus among experts that current network 
management approaches will be insufficient to handle the 
level of complexity that is envisioned 1 [2]. Consequently, new 
approaches for network management and control in complex 
systems are needed. 

'As stated by the National Research Council in a report on countering 
teiTorism: "Research is also needed for self-adaptive networks that can 
reconfigure themselves in response to damage and changes in demand, and 
that can degrade gracefully." [2] 



One promising approach is based on what is often referred 
to as Swarm Intelligence (SI). The term SI has come to repre- 
sent the idea that it is possible to control and manage complex 
systems of interacting entities even though the interactions 
between and among the entities being controlled is, in some 
sense, minimal. This notion therefore lends itself to forms of 
distributed control that may be much more efficient, scalable 
and effective for large, complex systems. 

The underlying features of SI are based on observations 
of social insects. Ant colonies and beehives, for example, 
have the interesting property that large numbers of them 
seem to conduct their affairs in a very organized way with 
seemingly purposeful behavior that enhances their collective 
survival. Surprisingly and paradoxically, these insects seem to 
utilize very simple rules of interaction. This phenomenon is 
very similar to those addressed in other domains of inquiry 
involving complexity such as cellular automata and the study 
of chaos [3], [4], [5]. These areas along with SI have perplexed 
a large number of scientists for many years [6]. How is it 
that "swarms" of creatures with relatively low brain power 
and communications capabilities can engage is what is often 
termed "emergent behavior" reflective of some "collective 
intelligence" [6, p. 6] — behavior that seems to exhibit a more 
global purpose? 

Unfortunately, there is no widely agreed upon definition 
of what SI is or how it should or could be mathematically 
defined or characterized. Many terms have been associated 
with SI such as emergent behavior, self-organized behavior, 
collective intelligence, and the like and have been used in a 
variety of contexts and associated with a host of applications 
[7], [8], [9], but these terms also suffer from vague definitions 
or descriptions. There is no general, mathematically oriented 
description that ties all of these concepts together. 

The lack of precise definitions and, hence, theoretical foun- 
dations, poses a number of significant problems and even 
causes confusion. The lack of precise definitions is the least of 
the problems — this confusion also entails missed opportunities 
as well. All these different descriptions and implementations 
muddy the waters of how to productively utilize SI concepts. 
Without a clear understanding of what SI is and how and why 
it arises, it is very difficult to envision how to take advantage 
of its ttue potential. 

The remedy for this apparent confusion comes from new 
perspectives that illuminates the fundamental properties of 
SI. This article seeks to do just that by articulating some 
useful ideas based on perspectives from evolution, notions 
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of efficiency, and adaptability coupled with a more formal 
definition of self-organized behavior. 

This article formulates three strategic or foundational frame- 
works from which to build a successful theory of SI and which 
provide guidelines for how to implement SI concepts. These 
frameworks are based on the development of the successful 
theories pertaining to the simulated annealing (SA) algorithm. 
The application of these frameworks to SI may provide the 
necessary yet still missing ingredients for developing a better 
understanding of SI. This may lead to entirely new ways of 
viewing and understanding this paradigm and ultimately allow 
for more practical implementation schemes. 

The most important part of this foundational triad, and a 
necessary ingredient for developing a solid theoretical founda- 
tion for SI, is the articulation of some first principles based on 
the relevant laws of nature and their implications. This serves 
as a guide that governs and constrains the articulation of the 
other important components in developing a useful theory. For 
SI, these first principles are based on the laws of evolution and 
natural selection. As the reader will discover, its reasonable 
implications suggest a more precise and mathematically useful 
definition of self-organized behavior — the evolution of system 
states along a Pareto optimal frontier. 

The second component of this triad is the articulation of 
an appropriate dynamical framework, a way of characterizing 
system dynamics that is a consequence of or constrained 
by the first principles. The dynamical framework suggested 
here is based on a new concept in the context of SI — 
Scale Invariant Pareto Optimality (SIPO). SIPO is a powerful 
concept that captures notions of symmetry and scale invari- 
ance and can address the issues of how swarms of entities 
communicate, modify their behavior, and adapt to changing 
environmental conditions — one of the hallmarks of SI. This 
dynamical framework also provides a set of rules that, in 
effect, imposes constraints on a system's dynamics so as to 
maintain consistency with the implications of the relevant laws 
of nature as articulated in the first principles. 

Finally, the third component of this triad is the articu- 
lation of an appropriate problem framework. This provides 
useful ways to abstract these ideas and allows them to be 
implemented. It provides a concrete way of defining problems 
that help to further narrow the issues and focus research and 
development efforts. This article describes a general test-bed 
approach using the concept of swarming finite-state machine 
(SFSM) models. 

Together, this triad of frameworks, or meta-formalism, 
present a unified scheme for approaching the research prob- 
lems and investigating ways to implement SI concepts. It 
addresses how swarms of entities must communicate and 
modify their behavior in response to information from other 
entities and their environment for there to exist the emergent, 
self-organized behavior known as "swarm intelligence". This 
set of perspectives leads to a more precise mathematical 
definition of SI, describes ways to more fully take advantage 
of this paradigm and, constructively address any of its inherent 
limitations. 

The rest of this article is organized as follows: Section 
ITT1 provides more detail on the current research environment 
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regarding SI and its myriad of applications. Section |lll] de- 
scribes why the three formalisms described above provide 
a sound basis for developing the theoretical foundations of 
SI by describing similarities to the de facto frameworks 
associated with simulated annealing. Sections IIII-AI through 
IIII-CI describe the three frameworks on a conceptual level. 
Finally, Section ffVl provides concluding remarks. 

II. Background: The Swarm Intelligence Paradigm 
A. Observations of Social Insects 

Observations of social insects such as ants and ant colonies 
provide a great deal of insight into their behavior and SI in 
general. Ants and ant colonies have several ways of solving 
different but related problems. The main mechanism for solv- 
ing them is through the use of chemical substances known as 
pheromones which have a scent that decays over time through 
the process of evaporation [6, p. 26]. These pheromones form 
the basis of what amounts to a clever, and apparently simple, 
communications and information storage and retrieval system. 
Since pheromone strength or intensity decays over time, it also 
provides a very simple information processing mechanism that 
can implement forms of positive and negative feedback [6, pp. 
9-10, 41] and reinforcement learning mechanisms [6, p. 96]. 
This "processing" capability is illustrated in the simplicity of 
how ants utilize and respond to pheromones. 

As an example, consider how ants actually solve shortest 
path problems. Their motivation for solving these problems 
stems from their need to find sources of food. Efficiency 
dictates that they find sources closest to their colonies. Ants 
(many ants) first set out in search of a food source by randomly 
choosing (apparently randomly) several different paths. Along 
the way they leave traces of pheromone [6, p. 42]. Once ants 
find a food source, they retrace their path back to their colony 
(and in so doing inform other ants in the colony) by following 
their scent back to their point of origin. Since many ants go 
out from their colony in search of food, the ants that return 
first are presumably those that have found the food source 
closest to the colony or at least have found a source that is in 
some sense more accessible. In this way, an ant colony can 
identify the shortest or "best" path to the food source [6]. 

The cleverness and simplicity of this scheme is highlighted 
when this process is examined from what one could conceive 
of as the ants' perspective — they simply follow the path with 
the strongest scent (or so it seems). The shortest path will 
have the strongest scent because less time has elapsed between 
when the ants set out in search of food and when they arrive 
back at the colony, hence there is less time for the pheromone 
to evaporate. This leads more ants to go along this path further 
strengthening the pheromone trail and thereby reinforcing the 
shortest path to the food source and so exhibits a form of 
reinforcement learning [6], [10], [11]. 

But this simple method of reinforcement or positive feed- 
back also exhibits important characteristics of efficient group 
behavior. If, for instance, the shortest path is somehow ob- 
structed, then the second best shortest path will, at some later 
point in time, have the strongest pheromone, hence will induce 
ants to traverse it thereby strengthening this alternate path. 
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Thus, the decay in the pheromone level leads to redundancy, 
robustness and adaptivity, i.e., what some describe as emergent 
behavior [6]. 

Many optimization algorithms attempt to imaginatively cap- 
ture some notion of SI. Indeed, many difficult optimization 
problems have been solved by so-called ant algorithms such 
as the Traveling Salesman Problem, the Quadratic Assignment 
Problem and other A^P-hard optimization problems (see [6] for 
a large number of examples and citations). These algorithms 
generally utilize some analogue of pheromone or some simple 
stigmergic signalling mechanism. AntNet [12] for example, 
uses reinforcement learning to increase the probabilities of 
using certain routes in a routing algorithm. The probability 
value is used as an analogue to pheromone. Another example 
is in [13] which uses a similar update mechanism to control 
unmanned aerial vehicles. These different approaches all try 
to take advantage of how social insects seem to function. 
These attempts to implement some SI characteristic however 
often are forced to creatively sidestep the concept of self- 
organization and its implications. 

B. The Mystery of Self-Organization 

Although much has been learned from observations of these 
social insects, there is no widely agreed upon definition of 
what constitutes SI. Indeed, the term SI is bandied about so 
often and in such a wide variety of contexts that it causes 
confusion leading to many different interpretations and im- 
plementations for various problems. Although many of these 
implementations reflect some notion of SI, they often entail 
other paradigms as indicated above such as reinforcement 
learning [11], [14] and stigmergy which refers to complex 
indirect interactions based on simple signalling systems [6, 
p. 14]. Where does one paradigm end and the other begin? 
Still, other research seems to freely use the term SI when there 
simply are large numbers of interacting entities (see [15] for 
a complete discussion on competing paradigms). Does that 
alone suffice to describe or define SI? Is it merely a way of 
somehow taking advantage of parallel computing methods? 

Notwithstanding the many different descriptions offered by 
many researchers, the main features of SI seem to involve 
forms of limited or minimal communications and/or inter- 
actions, large numbers of interacting entities with limited 
reach, and some globally efficient, emergent or self-organized 
behavior [6, p. 9]. Bonabeau et al. [6, p. 9-11] suggests that 
the central features of SI are based on the manifestation of 
self-organization that arise from the interplay of four basic 
ingredients: 1) forms of positive feedback, 2) forms of negative 
feedback, 3) the amplification of fluctuations that give rise 
to structures, and finally 4) multiple interactions of multiple 
entities. 

But even this characterization provides very little insight 
into what SI is except in very descriptive terms. For example, 
it does not fully explain in a unifying way why or how 
pheromones evolved in the way they did, or why they should 
have the evaporative properties they have, or why they have 
their chemical makeup, or how ants can somehow distinguish 
among different types of pheromone (presumably in order to 
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identify ants from other colonies) [16, p. 156]. Pheromones are 
complex chemical signalling systems, yet most of the research 
that deals with them or models their effects use the concept 
in very limited ways, e.g., as a scalar in ant algorithms [14], 
[17] as opposed to a more complex scheme represented by 
vectors. Although, as we shall see, even simple scalars can 
possess enough information related to a measure of efficiency, 
pheromones are likely to have more complicated properties 
than their mere intensity [6]. Indeed, Wiener [16] in his 
ground-breaking book Cybernetics emphasizes the importance 
of intercommunication among the entities in question: 

How then does the beehive act in unison, and at 
that in a very variable, adapted, organized unison? 
Obviously, the secret is in the intercommunication 
of its members. . . This intercommunication can vary 
greatly in complexity and content ... the value of 
a simple stimulus, such as an odor, for conveying 
information depends not only on the information 
conveyed by the stimulus itself but the whole ner- 
vous constitution of the sender and the receiver of 
stimulus as well [16, p. 156-7]. 

However SI is described, one of its central characterizations 
is that of self-organization. But this also begs the question of 
what constitutes SI because there is no clear understanding of 
what self-organization or emergent behavior is! These terms 
have been around for some time and their definitions have 
been and probably will continue to be debated for some time. 

Serra et al. describes the concept of self-organization gen- 
erally as "highly organized behaviour even in the absence of 
a pre-ordained design." [18, p.l] (but what is 'organized'?). 
They go on to further describe examples such as the reso- 
nance phenomenon in lasers, and in cellular automata where 
"unexpected and complex behaviours [] can be considered as 
self-organized." [18, p.2]. 

Earlier, Prigogine 2 et al. [5] described self-organization in 
chemical reactions and thermodynamic contexts. His notion 
is quite enlightening and emphasizes the element of fluctua- 
tions in far-from-equilibrium system states. Non-linear system 
changes, fluctuations as he refers to them, due to either 
random events or chaotic dynamics, are then amplified by 
positive feedback mechanisms. This results in structures that 
emerge spontaneously which often are presented as examples 
of self-organization. It is interesting that Prigogine provides an 
example of SI based on the clustering behavior of termites in 
constructing termite nests [5, p. 18 1-186]. See also [6, p. 207- 
234]. Even earlier, Wiener [16] used the term in describing 
the brain waves of humans. See [16, Chap. 10: Brain Waves 
and Self-Organizing Systems]. Again, the notion of what self- 
organization is remains unclear. Is it merely some structure or 
pattern? And if so, how is one to distinguish it from merely 
random effects? 

What is needed to truly take advantage of SI is more than the 
mere descriptions of the attributes of SI and self-organization. 
Something amenable to formal mathematical definition would 
be quite valuable. This article offers a new and potentially 

2 Prigogine won the Nobel Prize in 1977 for his work on the thermodynam- 
ics of non-equilibrium system. 



SWARMING: NETWORK ENABLED C4ISR 

useful working definition of emergent or self-organized behav- 
ior for SI based on the imperatives of evolution and natural 
selection (as opposed to the imperatives of thermodynamics 
and irreversibility that elicited Prigogine's definition [5]). The 
concept of self-organization therefore plays a central role in 
the development of the foundational principles of SI. 

These principles are founded on the notion that self- 
organized behavior of the type observed in entities subject to 
the laws of evolution involves forms of efficiency in resource 
allocation. It emphasizes the importance of signals as mecha- 
nisms of change in system states subject to the imperatives of 
the evolutionary pressures of natural selection. It is important 
to recognize that the response of insect swarms to external 
stimuli is governed by processing systems that have been 
heavily influenced and affected by the forces of evolution. This 
allows consideration of a much richer spectrum of behaviors, 
both simple and complex, than those implied by the mere 
application of positive and negative feedback mechanisms to 
simple signals. In fact, the type of self-organization described 
here can be framed as a form of symmetry in that changes in 
the system's behavior and operating points nonetheless leave 
unchanged certain attributes associated with these operating 
points, namely, their Pareto optimality [19]. In short, a novel 
definition of self-organization presented here is system behav- 
ior that maintains its operating points on or near a Pareto 
optimal frontier. This notion of efficiency constitutes a central 
feature of the foundational principles for SI and is described 
in greater detail in the next section. 

III. The Importance of Basic Principles 

Developing a viable set of basic principles underpinning 
SI concepts is essential for successful implementations. The 
principles described here have a certain credibility and hence 
viability because they are based on how other successful 
theories appear to have been developed. This requires some 
discernment and creative articulation of the main features of 
how these other theories were, in fact, developed and how 
these features could be applied in the case of SI. The simulated 
annealing (SA) algorithm is a good example of this process 
and provides clues on how to articulate the corresponding 
components of the central principles underlying SI. 

A great deal of theoretical results on SA have been pub- 
lished in many papers and books (see e.g., [20], [21], [22], 
[25]). But this algorithm was not developed or analyzed in a 
vacuum. Its development was, in fact, based on the conflu- 
ence of certain factors. Three main factors or components of 
the research evolution are discernable. The first component 
consists of a set of first principles based on the relevant 
laws of nature and their implications. The second component 
is an appropriate dynamical framework. Finally, the third 
component is an appropriate problem framework. These three 
components form the core of what can be referred to as a set 
of foundational frameworks. To see how these components 
provide guidance in developing a viable research strategy for 
SI, an understanding of how they worked together in the case 
of SA is helpful. 

In SA, the first principles were based on the laws of ther- 
modynamics. The most salient implication of this law was that 
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the entropy of a thermodynamic system, the randomness of its 
states, tends to its maximum value at any given temperature. 
Metropolis [23] utilized this idea in developing a method 
for simulating the evolution of a complex thermodynamic 
system. The underlying law of thermodynamics was manifest 
in the objective function of a non-linear program (NLP). The 
objective of this NLP was to maximize the entropy of a 
stationary Markov chain subject to certain constraints (such 
as the the constraint that expected energy level is proportional 
to temperature) [21]. 

The solution of this NLP lead to an expression for the 
stationary probability distribution (the Boltzmann Distribution) 
of each system state [21]. Thus, the implication of the first 
principles was the expression of the Boltzmann Distribution. 
What remained was to somehow develop an appropriate 
mechanism that described how this thermodynamic system 
changes from one state to another while at the same time 
remaining consistent with the maximization of entropy, i.e., 
the Boltzmann Distribution, as dictated by the fundamental 
laws of thermodynamics. 

Thus, in addition to this fundamental law, a dynamical 
framework was necessary to address the problem of how to 
model a system that undergoes some type of change. The 
well-known mathematical framework of Markov chains and its 
global and detailed balance equations provided those necessary 
elements that mathematically describe the stochastic nature of 
the system as it changes from one state to another while still 
obeying the fundamental laws of thermodynamics [20], [22], 
[23]. It was this utilization of the detailed and global balance 
equations that ultimately lead Metropolis et al. to his now 
famous Metropolis Acceptance Criterion, the main result of 
Metropolis' contribution. In effect, Metropolis 'engineered' a 
transition probability to fit within the framework of global and 
detailed balance equations given the stationary probability, i.e., 
the Boltzmann Distribution. 

Once these transition probabilities were defined, the math- 
ematical description of how a large, thermodynamic system 
evolves from one state to another in a computer simulation 
was complete [23]. Once this method of simulating a thermo- 
dynamic system was in place (in 1953), the idea of reducing 
temperature slowly to "anneal", conceived by Kirkpatrick [24] 
some 30 years later, was the final element that lead to the SA 
algorithm. This algorithm has since been fully mathematically 
described and analyzed by a host of researchers 3 (see [20], 
[22], [23], [24]) using various combinatorial optimization 
problems (COPs) as the problem framework — a general way of 
characterizing a whole host of discrete optimization problems. 
Thus, the dynamical and problem frameworks provided the 
mathematical constraints and research focus in a way that was 
consistent with the laws of thermodynamics. 

To summarize how these three components of the founda- 
tional principles, the meta-formalism, describes the evolution 

3 Although the algorithm seems to have been fully analyzed, this proposer 
believes there are still a number of fundamental theoretical elements that 
await discovery. See e.g., Fleischer [25] for a description of scale invariant 
properties of the SA algorithm. These properties illustrate some potentially 
important symmetries in SA and may play a role in research involving a group 
theoretic approach for modelling SI. 
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of SA theory, the first principles were based on thermody- 
namics manifest as the maximization of the entropy in an 
NLP that lead to the definition of the Boltzmann Distri- 
bution. The dynamical framework was based on the global 
and detailed balance equations of Markov chains which lead 
to the Metropolis Acceptance Criterion and the transition 
probabilities in SA. Finally, the problem framework was based 
on COPs which permitted the application of the ideas onto test 
bed problems for experimentation and analysis. A successful 
and useful set of formalisms can therefore be modelled on this 
developmental process 4 hence ought to involve first principles 
based on fundamental laws, an appropriate dynamical frame- 
work, and an appropriate problem framework. Each of these 
three components can be recast to within the context of SI. 
This is described in more detail in the next three sections. 

A. First Principles: The Laws of Evolution 

Any successful theory requires some reasonable and widely 
accepted assumptions on which the theory is based. Because 
the SI paradigm is based on observations of social insects, 
i.e., living creatures, the first principles ought to based on, 
involve, or at least be consistent with, the laws of evolution 
and its implications. Basically, this theory states that species 
evolve over generations constantly improving their fitness to 
survive through the process of natural selection and genetic 
mutation. Indeed, this theory forms the basis of GAs and its 
associated theoretical results [27], [26]. But this theoretical 
statement by itself seems insufficient to establish foundational 
principles for SI. Some reasonable implications of this theory 
are needed. 

One reasonable and important implication of the theory 
of evolution as it relates to swarms of insects is that it 
selects out those systems that exhibit efficient behavior on 
many scales. This notion of efficiency is not only reasonable, 
but is also based on observations of social insects and how 
they adapt to changing environmental conditions (see [26]). 
Indeed, it seems basic that an efficient allocation and use of 
resources provides a distinct survival value to any species. 
Thus, a reasonable implication of evolution regarding the 
behavior of social insects is that it lead to behaviors that are, 
in some measure, efficient. These notions of efficiency and 
adaptability have an important mathematical formalism based 
on the concept of Pareto optimality. 

1) Efficiency Via Pareto Optimality: Optimization problems 
are ubiquitous in the real world and social insects must also 
deal with a variety of them if they are to survive. Certainly, 
the efficient allocation of resources present problems where 
some goal or objective must be maintained or achieved. Such 
goals or objectives are often mathematically modelled using 
objective functions, functions of decision variables or parame- 
ters that produce a scalar value that must be either minimized 
or maximized. The challenge presented in these often difficult 
problems is to find the values of those parameters that either 
minimize or maximize, i.e., optimize, the objective function 
value subject to some constraints on the decision variables. 

4 We believe that other theories such as those associated with genetic 
algorithms (GAs) offer a similar developmental model. 



Most complex systems, however, have several objectives 
that must be considered when assessing overall system per- 
formance. Operations of these systems sometimes leads to 
conflicting objectives where one objective must be efficiently 
"traded off" for another. In these multi-objective optimization 
problems (MOPs) system efficiency in a mathematical sense 
is often based on the definition of Pareto optimality-a well- 
established way of characterizing a set of optimal solutions 
when several objective functions are involved (see e.g., [28] 
and the citations therein). 

In this perspective, each operating point or vector of deci- 
sion variables (operational parameters) produces several objec- 
tive function values corresponding to a single point in objective 
function space (this implies a vector of objective function 
values). A Pareto optimum corresponds to a point in objective 
function space with the property that when it is compared to 
any other feasible point in objective function space, at least 
one objective function value (vector component) is superior to 
the corresponding objective function value (vector component) 
of this other point. Pareto optima therefore constitute a special 
subset of points in objective function space that lie along 
what is referred to as the Pareto optimal frontier-the set 
of points that together dominate (are superior to) all other 
points in objective function space [29]. See also [30] and [31] 
for descriptions of multi-objective GAs (their intriguing titles 
notwithstanding do not encompass the ideas put forth here on 
SI). 



Fig. 1. The Pareto Optimal Frontier 

An important characteristic of Pareto optima, this frontier 
of points, is that they correspond to operating points where 
the improvement of one objective function value comes only 
at the cost of worsening some other objective function value — 
trading off one for another. Figure ^ illustrates this set of 
points in objective function space by the open O's where each 
operational decision vector produces two objective function 
values /i and /2 that are minimized. It is along this set 
of points that operational decisions must be restricted if 
operational efficiency is to be maintained. It is easy to see 
that the Pareto frontier dominates (is superior to) the other 
points in this 2-dimensional objective function space. 

Effective methods for determining several Pareto optima can 
be quite valuable for enhancing the survival value of a species 
(or managing a complex system) because it enables adaptive 
behavior. This allows the possibility of efficient operations 
when there are changes in the relative utility of the several 
objectives or when one objective must be restricted to a certain 
range of values. Under such circumstances, the system can be 
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moved along the Pareto frontier from one Pareto optimum to 
another thereby maintaining operational efficiency. Thus, if in 
an ant colony a path to a food source becomes congested, then 
other routes must be utilized. Although the distances to food 
sources are generally minimized as is the level of congestion, 
these often conflicting objectives can be efficiently traded off 
when the shortest distance is sacrificed to lessen the level of 
congestion [6]. 

To summarize this first component of the foundational 
principles of SI, the first principles and its implications sug- 
gest that both subsystems and system should behave in a 
manner consistent with Pareto optimality. What remains is to 
characterize Pareto optimal behavior in a context reflective 
of SI. It is noteworthy, and somewhat surprising, that these 
ideas pertaining to Pareto optimality and self-organization have 
apparently not been pursued or even articulated by others in the 
context of SI research! See e.g., the very complete books on 
SI such as [6]. Assuming that natural selection pressures have 
imposed behaviors leading to efficient, adaptable, i.e., Pareto 
optimal behavior, the question of how this is achieved in social 
insect societies must be addressed. What are the mechanisms 
that social insects use to solve what amount to MOPs with the 
limited forms of communication they seem to have? 

It may be that notions of scale are involved. Indeed, the 
concept of stigmergy suggests that what has been regarded as 
collective intelligence or emergent behavior on a large scale 
is effected by decisions on a smaller scale [6, p. 14]. But 
how is such a relationship among different scales possible 
when decisions affecting the local environment are based on 
apparently simple information? How can system change be 
mathematically described in a way consistent with movement 
along the Pareto optimal frontier? More significant however is 
the notion of efficiently trading off the efficiencies on one scale 
with efficiencies on another scale. Addressing these issues 
mathematically is the purpose of the dynamical framework 
described below. 

B. The Dynamical Framework: Promoting Adaptability 

The purpose of the dynamical framework is to provide 
guidelines for how to mathematically model the dynamics of 
the system in a way that is consistent with the first principles 
and faithfully reflects the desired phenomenon. Its articulation 
provides the necessary research and development focus for 
describing system dynamics. This allows specific questions to 
be asked that lead to specific tests, mathematical constraints, 
and the like. In essence, it facilitates the engineering of the 
dynamical rules that, in this case, lead to Pareto optimal 
behavior. It is the essential feature that allows this type of 
research to proceed. 

For SI, the dynamical framework must be based on the 
two intertwined principle features of SI — its emergent or self- 
organized behavior, and the minimal information process- 
ing and communications capabilities of the entities involved. 
These two rather vague notions must somehow lead to more 
well-defined properties and characterizations. The preceding 
section suggests that the emergent behavior can be charac- 
terized by Pareto optimal operating points in some decision 
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space. The additional requirement that such efficient behavior 
arises from limited communication and information, in effect, 
imposes some scaling aspects. These are addressed by the 
Scale Invariant Pareto Optimality (SIPO) concept described 
in the next section. 

1) Scale Invariant Pareto Optimality: The SIPO concept 
stems from a belief that evolution has lead to behaviors of 
social insects that promote the entire species. Somehow over 
the eons of evolution, insects societies have developed rules 
of conduct that are biologically driven and lead to efficient 
management of their environment and resources [6, p. 30]. Yet, 
given their apparently limited capacity for communication and 
information processing, it seems rather mysterious that in spite 
of these limitations, their behaviors seem almost guided by an 
invisible hand 5 that somehow efficiently manages an entire 
colony's resources! How is this possible? As stated earlier, 
system movement along the Pareto optimal frontier lends 
itself to adaptable and efficient behavior thereby increasing 
the survival value of these social insects. The heart of the 
matter in SI therefore becomes: 

How can an entire system be moved along the Pareto 
optimal frontier when the entities that comprise the 
system have only limited information available to 
them on which to base their own behaviors and 
where their behaviors have only limited effects on 
their environment and other entities? 

The SIPO concept attempts to answer these types of 
questions. Simply put, it is a way of characterizing Pareto 
optimal solutions on many scales and thus reflects efficient 
and adaptable system behaviors on many scales. It provides 
a basis for defining properties of systems that are reasonable 
and have been observed in social insect colonies. Evolution can 
be seen as imposing not just behaviors that enhance survival 
of individual entities, but also of groups or societies of these 
entities [6, p. 30-1]. Thus, while ants must manage their own 
survival, they also derive distinct advantages from participating 
in a colony. Indeed, scientists have observed behaviors in ant 
colonies that in many respects is efficient on several different 
scales involving sub-societies or sub-colonies (see [6, p. 209] 
for a discussion on nest modularity). There are divisions of 
labor, aspects of specialization and such that contribute to 
the overall survival value of an entire colony [6, p. 2]. This 
interdependency seems to promote their collective, and hence 
individual, survival. 

Since the SIPO concept revolves around the concept of 
Pareto optima 6 it is necessary to fully characterize its mathe- 
matical attributes. So far, only its definition has been described. 
The next few paragraphs describe another important aspect of 
Pareto optima. 

5 Their economies like those of humans may indeed be guided by Adam's 
Smith's invisible hand, i.e., swarm intelligence! 

6 It is worth noting that this description of SIPO concepts seems, on the 
surface, to be similar to those articulated by others. Menczer [31] e.g., 
describes an evolutionary algorithm that seeks to determine Pareto optima and 
has some scaling properties. But this article does not describe the adaptable 
behavior of entities in terms of movement along a Pareto frontier. Coello 
et al. [30] also provides an intriguing title, but their paper also deals with 
evolutionary optimization schemes for solving MOPs and does not address 
issues pertaining to SI. 
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a) The Measure of Pareto Optima: A rather intuitive 
yet surprisingly little known aspect of Pareto optima is its 
measure. This measure is based on the size of the set of 
points in objective function space that are dominated by the 
Pareto optimal frontier — in essence a Lebesgue measure or 
hypervolume. Zitzler [32] was apparently the first to publish 
this idea in a brief three sentence description. 7 Fleischer [28] 
extended this idea and described a set function that maps a 
set of points in objective function space to a scalar value, 
the Lebesgue measure, and formally proved that this scalar 
achieves its maximum value if and only if its arguments are 
Pareto optima. Fleischer also defined an efficient (polynomial) 
algorithm for calculating this scalar for an arbitrary number 
of dimensions (objective functions). 




Fig. 2. The hypervolume of Pareto optima. 

This hypervolume is illustrated by the shaded region in 
Figure [2] for two minimizing objective functions f\ and f% 
(see [29] for more complicated illustrations in 3 dimensional 
space). The dotted lines are upper bounds on f\ and /2 while 
the black dots correspond to feasible points in the underlying 
decision space. Because this set function produces a scalar, 
it can be used as an objective function in an optimization 
algorithm such as SA. The proof that if this hypervolume is 
maximized then its arguments are Pareto optima 8 implies that 
its use in SA can induce it to converge in probability to the 
Pareto optima! 9 

One important aspect of this measure is that it quantifies 
efficiency, i.e., changes in the overall efficiency can be mathe- 
matically captured by a changes in the scalar hypervolume and 
depicted by a receding or expanding Pareto optimal frontier 
(see below for a further discussion on efficiency tradeoffs). 
This measure and its potential roles in SI are described in the 
next sections. 

b) Subsystem Interactions and Constraints on Objective 
Functions: The SIPO property suggests that local efficiency 

7 This idea was independently formulated by the author who later discovered 
that Zitzler had already described it in a brief three sentence passage. 
The author was therefore compelled to provide a formal proof and some 
algorithmic extensions. 

8 This direction of proof of the if and only if implications is much more 
difficult, but is necessary in order to justify its potential use as an objective 
function in a metaheuristic such as SA where the mechanism of search 
is crucially dependent on the objective function value, e.g., the Metropolis 
Acceptance Criterion the goal of which is to improve the objective function 
value. 

9 The author is not aware of any other multi-objective optimization scheme 
that theoretically converges in probability to Pareto optima. 



may often be sufficient for system-wide efficiency. Articulating 
and applying this concept to systems of entities therefore 
imposes constraints on how subsystems can interact, using 
simple forms of information, so that a system, be it an ant 
colony, a swarm of chemical sensors, or a network of computer 
systems, operates on, or close to, the Pareto optimal frontier on 
both local and global scales. How this can be accomplished is 
non-trivial because several identical subsystems may operate 
on different Pareto optima in terms of local performance 
metrics. 

Consider the following simple example where two, identical 
subsystems have operating points (vectors of parameters or 
decision variables) on one of two local Pareto optima xi and 
X2. These two operating points lead to points in objective 
function space pi = (/(xi), g(xi)) = (fx,gi) and 

p 2 = (/(x 2 ),5(x 2 )) = (/2,52) for two local objective 
functions / and g, where, for purposes here, both are min- 
imized. Let functions F and G be global objective func- 
tions each taking two arguments, one from each subsystem. 
Assuming the order of arguments makes no difference in 
these functions, there are three distinct ways in which these 
two subsystems can operate on local Pareto optima: two 
ways where both subsystems operate on the same point in 
local objective function space, i.e., (i r (xi,xi),G(xi,xi)) 
and (i 7 '(x2, X2), G(x2, X2)) and one way in which the 
subsystems operate on different local Pareto optima, i.e., 
(F(xi, X2), G(xi, X2)). How should the global objective func- 
tions F and G be defined in terms of local objective functions 
so that all three combinations of local Pareto optima map into 
the set of global Pareto optima? 

More generally, under what circumstances do different com- 
binations of local Pareto optima associated with subsystems 
yield a global Pareto optimum? Put another way, what is the 
form of objective functions that allow mappings of Pareto op- 
timal points on the sub-system level, to Pareto optimal points 
on the system-wide level? This scaling from the sub-system 
level to a system-wide level may therefore impose certain 
additivity properties or exponential forms in the measures of 
performance and presents separability issues. For example, 
when must F(xi,X2) = /(xx) + /(X2)? This is another 
example of how the SIPO concept narrows down the playing 
field and sharpens the questions that need to be asked and 
answered in any development work involving SI concepts. 

Although it is desired that subsystem states be maintained 
along a local Pareto optimal frontier at the same time system- 
wide states are maintained on a global Pareto optimal frontier, 
the possibility exists that this may not always be possible 
depending upon how these objective functions and their re- 
lationships are defined. Afterall, subsystem states affect the 
system-wide objective functions which are also affected by 
large numbers of other subsystems. Moreover, subsystems 
do not behave independently. The resulting complexity is 
similar in many ways to the complexity of cellular automata 
(see below). The following section describes how can these 
constraints can be captured in a mathematically useful way 
that still reflects efficiency. 

c) Efficient Tradeoffs of Efficiency Between Scales: 
Sometimes the benefits to the individual entity must be traded 
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off for the good of the entire group — e.g., soldiers must often 
make the ultimate sacrifice to ensure the success of their unit 
(or nation). 10 This suggests the possibility that certain states 
of neighboring subsystems maybe 'incompatible'. That is, a 
certain Pareto optimum in one subsystem may preclude a 
particular Pareto optimum in another (similar to Ising spin 
glass models — see the section below on Markov Random 
Fields). Thus, it is conceivable that only certain combinations 
of Pareto optima among neighboring subsystems may coexist 
at any given time. If this is true, it may explain the notion of 
fluctuations as Prigogine calls them [5] and the formation of 
structures as an attribute of self-organization. 

If such incompatibilities exists between and among neigh- 
boring subsystems or between subsystems and system-wide 
Pareto optima, then it must be the case that there is some trade- 
off between the efficiency measures of individual subsystems 
and/or the efficiency measure of the entire conglomeration of 
entities. Some subsystems may have to sacrifice efficiency 
in order for other subsystems to operate Pareto optimally. 
Recall from the earlier discussion that the measure of Pareto 
optima is a quantification of the efficiency of a system. 
Thus, a mathematical way of characterizing a tradeoff of 
efficiency between two subsystems or between two scales of 
systems is to use their respective measures of efficiency to 
define a Pareto optimal frontier based on them. In effect, 
the scalar hypervolumes associated with Pareto optima on 
different scales themselves become objective functions for 
a Pareto optimal frontier! To the author's knowledge, this 
type of tradeoff has never been characterized or described in 
this fashion, hence provides a novel way of characterizing 
subsystem to system interactions. Figure [5] depicts such a 
tradeoff curve where the Pareto optimal frontier is the bolded 
part of the curve. Note that improvement on one scale comes 
at the cost of worsening the hypervolume on the other scale. 
Note also the concave form of the curve which is based on 
the ideas that follow. 11 

d) Structures and Self-Organized Behavior: The descrip- 
tions of self-organized behavior described earlier often entail 
the notions of structures or patterns as the main charac- 
terization of self-organization. The foregoing discussion on 
efficient tradeoffs of efficiency measures provides another 
way of describing how such structures and their precipitating 
fluctuations arise. 

First, some random configuration among the subsystem or 
entities occurs. Next, the subsystems all seek their local Pareto 
optimum. Finally, in the case where some subsystems cannot 
attain a Pareto optimum in terms of local objectives because 
of state incompatibilities among neighboring subsystems, these 
subsystems attempt an efficient tradeoff of efficiency measures 
as described above. These tradeoffs are between an estimate 
of the measure associated with the global Pareto frontier 
and an estimate of the measure associated with the local 

10 This type of behavior has also been observed in insect colonies where 
soldier ants and worker bees sometimes sacrifice their lives for the good of 
the entire colony. See [6, p. 36-9]. 

1 1 It should be borne in mind that the two axes depicted, one for subsystems 
and the other for the entire system are not independent, but this simple curve 
captures the essential idea of the efficient tradeoffs of efficiency measures. 
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Pareto frontier. Thus, as the subsystems and system evolves, 
neighboring subsystems may evolve in a way guided by these 
tradeoffs of efficiency measures. Thus, the state incompatibil- 
ities ultimately give rise to structures as subsystems attempt 
to operate on their local Pareto optimal frontier. The patterns 
of states are therefore governed by their initial conditions, 
the various state incompatibilities in terms of Pareto optima, 
and the efficient tradeoffs of efficiency measures in the midst 
of these subsystem state incompatibilities. The patterns or 
structures observed in insect swarms can therefore possibly 
be explained in these terms. 

e) The 'Graceful' Degradation of Performance: The 
foregoing discussion also suggests a general and mathemat- 
ical way of describing the "graceful degradation" of system 
performance. Such graceful degradation has been seen as 
an important component in the management of large scale 
systems such as the Internet [2, Ch.5]. Rather than suffering 
catastrophic changes to a system, it may be possible for system 
efficiency to be degraded in a more gradual manner. This is 
because system efficiency can be measured using the Lebesgue 
measure described earlier, hence the measures for different 
scales can be traded off efficiently! This idea of efficient 
tradeoffs of efficiency measures provides a quantifiable guide 
of how best to achieve this sacrifice of efficiency on one scale 
for improved efficiency on another scale. These types of issues 
can arise in a variety of contexts. 




Sub-system Efficiency ► 

Fig. 3. A tradeoff curve of different measures of efficiency. 

For instance, during the operation of a complex system it 
may happen that environmental changes, or changes in system 
state results in a subsystem-system configuration depicted in 
Figure 13 Here, the axis is an estimate of a particular sub- 
system's efficiency measure and the y— axis is an estimate of 
the global efficiency measure. The black dot corresponds to the 
subsystem/system efficiency point under current environmental 
conditions. These conditions suggest a tradeoff of efficiency 
is possible where the system's efficiency can be improved at 
a cost of decreasing the subsystem efficiency, i.e., the black 
dot is moved in the direction of the arrow. 

Obviously, if the efficiency measures of enough subsystems 
are degraded, it becomes increasingly difficult to compensate 
for their degradation. Eventually, the entire system's efficiency 
measure will be reduced, but the actions of the subsystems in 
establishing their movements along this Pareto optimal frontier 
of these efficiency measures, provides for the most graceful 
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degradation of efficiency — because the degradation occurs 
efficiently! How this occurs poses an interesting and, needless 
to say, difficult problem owing to the many subsystem interac- 
tions with the system, i.e., the complexity problem and the fact 
that these efficiency measures are not independent. Without 
be able to quantify efficiency with this scalar hypervolume, 
however, it would be impossible to address this problem of 
graceful degradation in any mathematical way that provides 
some guidance and direction for how to achieve this. This 
approach of characterizing the subsystem to system efficiency 
tradeoff therefore has many applications to both information 
networking, understanding the economics and management of 
such systems, as well as economics in general! 

SIPO is a new way of articulating emergent behavior in 
the context of systems operating under the imperatives of 
natural selection. It provides a way to mathematically char- 
acterize properties of systems on many scales and imposes 
constraints on how the subsystems interact and functionally 
affect global measures of performance. This makes imple- 
mentation of efficient systems more realistic, provides some 
underlying structure from which to investigate mathematical 
forms for objective functions, has the appealing feature of 
scale invariance and makes sense from an evolutionary point 
of view. But can such properties arise in systems with limited 
forms of communication and information processing? The 
following section provides a clue, based on recent research, 
that says it is indeed possible for Pareto optima to at least be 
associated with limitations on the form of information. 

2) Limited Signalling Systems and Pareto Optimality: The 
nexus of operational efficiency and SI seems to stem from 
how the swarms of entities, i.e., sub-systems, communicate 
and interact with each other and their environment. If we are 
to assume that biological systems have evolved over time to 
operate efficiently, then the signalling systems used by social 
insects must have some mathematical connection to multi- 
objective optimization and the concept of Pareto optimality. 
Moreover, such a signalling system must be simple in some 
sense. The SI paradigm as indicated in the literature is based 
on the use of very simple (and apparently low-cost) signalling 
methods. Study of ant colonies for example, shows that 
entire social systems operate efficiently using simple chemical 
signalling systems based on pheromones. These pheromones 
constitute a conceptually simple signalling system and provide 
a great deal of useful information, e.g., temporal information 
since the strength of the pheromone decays over time by the 
process of evaporation ([6, p.43]) in addition to memory of 
where the ants have been. By interpreting these chemical 
signals and responding to them with simple behaviors, ants 
perform complex functions efficiently. 

The idea of using a simple signal to convey information 
that leads to efficient behavior is somewhat remarkable and 
the question therefore arises as to whether a simple signal 
can even be associated with Pareto optima. The answer to 
this question is of course that it is possible in the sense 
that an entire set of Pareto optima can be measured using 
a single scalar value, the hypervolume or Lebesgue measure 
as described earlier (in the paragraph on the measure of Pareto 
optima). Certainly, a single scalar quantity qualifies as perhaps 
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the simplest signal [33]. Thus, a strong connection between 
Pareto optima and simple signals is possible. In effect, this 
scalar may constitute a mathematical analogue of pheromone! 
This scalar quantity not only provides the foundation for a 
whole host of heuristic methods that can reveal Pareto optima 
(or near Pareto optima, see [28], [29] 12 ), but may also enable 
the operational movement along a Pareto optimal frontier. 

Pheromone may therefore be nature's way of mapping an 
entire set of Pareto optimal points to a single scalar quantity 
and suggests mathematical constraints on the form of functions 
that mimic pheromone. Thus, just as maximizing entropy 
served as a guide in developing Metropolis equations of state 
[23] that ultimately lead to SA, maximizing this Lebesgue 
measure may serve as a guide in developing appropriate 
signalling systems to mimic pheromone and how other related 
quantities should change over time or in relation to other 
quantities. This is the heart and purpose of the dynamical 
framework. 

a) Swarm-based Solution Methods: One important con- 
sideration in this discussion on Pareto optima, efficiency and 
limited communications is how to tie it all together. How can 
one effectuate solution methods involving simple signals that, 
in effect, solve MOPs? 

One way alluded to earlier is to use the SA algorithm. One 
of its parallel forms, referred to as cybernetic optimization by 
simulated annealing (COSA), uses scalars that are communi- 
cated among processors to modulate the temperature control 
parameter [34], [35]. These scalars provide a feedback con- 
trol system that improves SAs performance. Like many ants 
signalling each other with simple signals, parallel processors 
running COSA communicate using simple signals in a way 
that enhances their collective performance. 

This form of parallel computation as many similarities 
to other parallel forms such as Tabu Search and so-called 
"random restart local search" methods [27]. These methods all 
share a number of features in common with how swarms of 
insects solve problems e.g., by exploring many different paths 
simultaneously (see [6, p. 55]). In particular, each processor in 
a parallel system starts at some randomly selected point in 
decision space and attempts to find the global optimum from 
that point. With many such processors, the chances of finding 
the global optimum in a reasonable amount of time increases. 

b) Transformations and Movement Along the Pareto Op- 
timal Frontier: An important aspect of multi-objective opti- 
mization problems is that the vectors of decision variables 
that produce the Pareto optima can be scattered throughout 
the decision space (the domain) in an apparently random or 
haphazard manner [28], [29]. This is due to the interplay 
among the several objective functions — Pareto optima may lie 
on the sides of hills rather than at their tops or bottoms (the 
local minima or maxima). Thus, it is very difficult to determine 
where these decision variable are in the decision space. Some 
appropriate method to determine the Pareto optima and their 
corresponding vectors of decision variables is necessary (see 
[28]). 

12 Note that these results have also detailed efficient algorithms for calcu- 
lating the scalar associated with this set function for any number of objective 
functions. 
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It bears emphasis that two Pareto optima that are close 
to one another in objective function space may have their 
underlying vectors of decision variables quite widely separated 
and vice versa. Thus, assuming that a gradual change in 
objective function space is desired, such as when there is a 
gradual change in the relative utility of the different objective 
functions, some method of moving from one vector of decision 
variables to another such vector some distance away must be 
devised. In other words, once the Pareto optima are identified, 
some method of transforming one Pareto optimal decision 
vector to another is required. 

One approach for performing such a transformation is 
through the use of feedforward neural networks [36, Ch.2]. 
Such networks can be trained so that a Pareto optimal deci- 
sion vector in the input produces the desired Pareto optimal 
decision vector at the output. Of course, a great deal of further 
research and development in this context is necessary. 

This dynamical framework based on the SIPO concept and 
its relationships to emergent behavior, efficiency, and simple 
signalling systems (as suggested by the Lebesgue scalar), 
provide a framework that may enable novel approaches to SI 
research and development in ways consistent with the first 
principles described earlier. Together, the two main compo- 
nents of the SI paradigm, self-organization as indicated by 
movement along a Pareto optimal frontier and the minimal 
signalling systems as indicated by the Lebesgue measure for 
Pareto optima, provide a mathematical foundation from which 
fundamental theories and principles of SI can be developed 
because they are more precisely defined and characterized than 
the rather ad hoc approaches currently in vogue that attempt 
to capture some aspect of SI. 

To fully accomplish this potential however, requires some 
way of applying these ideas to concrete and abstract problems, 
i.e., test bed problems that facilitate the testing, experimenta- 
tion and ultimately the application of these ideas. The prob- 
lems used must have some potential for exhibiting emergent 
phenomena and at the same time allow for modelling the 
simple forms of interaction, the stigmergy, that is the hallmark 
of SI. The problem framework, the last of the three formalism 
components, provides these important elements for SI research 
and development and is described in the next section. 

C. The Problem Framework 

An important aspect of the problem framework is that it 
allows specific questions to be asked and problems explored. 
It must be sufficiently general to not only capture the essential 
elements of the phenomenon, but also offer some avenues on 
which to implement the dynamical framework. It must also be 
specific enough to allow useful and appropriate mathematical 
tools to be brought to bear. The problem framework described 
here attempts to achieve this by describing an abstract problem 
type based on what is termed here as swarming finite state 
machine (SFSMs) models. 

1) Swarming Finite State Machine Models: The idea be- 
hind SFSMs is that notwithstanding the limited capabilities 
of swarms of entities, we should nonetheless provide some 
way of flexibly describing their level of complexity. Afterall, 



© 2003 BY MARK FLEISCHER. ALL RIGHTS RESERVED. 10 

human societies present self-organized behavior and certainly 
forms of 'swarm intelligence'. The allusion to Adam's Smith's 
"unseen hand" ([37]) and Alvin Toffler's social code based on 
society's propensity for concentration, centralization, special- 
ization, standardization, synchronization, and maximization 
(see [38]) suggest that the central features of SI may be 
ubiquitous. Are these not attributes of very complex forms of 
swarm intelligence? What is needed then is a way to address 
a continuum of complexity and swarm intelligence. 

SFSMs allow us to do this in that each finite-state machine 
(FSM) has a finite set of inputs, outputs and internal states that 
provide memory, the potential capacity for learning (where 
the transitions to states are affected by learning methods-see 
[39]) and certainly the complexity that we hope to manipulate 
and gain insight about. Moreover, the level of complexity can 
be changed by simply changing the numbers of states that 
comprise the SFSMs. 

SFSMs form a type of cellular automata that when coupled 
with appropriate transformation functions and objective func- 
tions to reflect operation along a Pareto optimal frontier will 
exhibit the type of self-organized behavior defined in Section 
IH-BI The states of each FSM will have some functional depen- 
dence on neighboring FSMs and will be designed to stabilize 
on efficient operating points. This may entail using genetic 
algorithms (GAs) or involve neural networks to search the 
space of transition functions or train on prescribed operating 
points so as to induce the FSMs to operate on Pareto optimal 
frontiers. 

These types of simulations provide a real microcosm of 
the SI world. It can effectively test whether natural selection 
pressures are consonant with efficient operating points in 
the manner described here. Fitness functions related to the 
efficiency measures could be used to induce the SFSMs to 
achieve this. It would indeed be interesting to see how these 
SFSMs can then be designed (or evolved) to efficiently handle 
tradeoffs of efficiency measures, a cornerstone of the SIPO 
concepts described earlier. 

The use of SFSMs allow for abstractions and, ultimately, 
practical implementations. The level of complexity can be 
controlled in the sense that the size of the FSMs, its number of 
states, inputs and outputs, can be decided upon a priori. The 
state space of the SFSMs, how its transformation functions 
are established and how they interact of course require further 
research, but this problem framework and the other two for- 
malisms mentioned earlier provide an important structure on 
which real swarm intelligence can be studied and implemented 
for real-world problems. 

It is worth pointing out, however, that much of what has 
been described, the properties and characteristics and such, 
have features in common with many other areas of inquiry 
in the sciences and engineering. The following areas describe 
some of these features that offer some potentially useful tools 
and ideas for developing SI systems. 

D. Other Scientific/Engineering Tools 

1) Group Theory: The structure of the emergent or self- 
organized behavior in SI based on the meta-formalism, i.e., 
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movement along the Pareto optimal frontier, provides a num- 
ber of methodologic and analytical advantages for exploring 
the fundamentals of SI. The most significant aspect of this 
structure is that it may permit the use of group theory in 
developing the fundamental theories of SI. Group theory 
provides powerful mathematical tools as well as the distinct 
possibility of developing novel insights. 

This perspective requires a number of reasonable assump- 
tions and conceptions. A basic conception is that we can model 
each entity i in a swarm as a subsystem that exists in a number 
of different states Xi = (xa, . . . , Xi n ) defined by n parameters 
corresponding to values associated with the local environment 
as well as the status of the agents themselves. The social 
insects, the ants for example, can be viewed as the agents 
of change in that they modify the states of these subsystems. 
The states of each subsystem also affect the states and agents 
of neighboring subsystems, hence is conceptually similar to 
cellular automata described below. 

The actions of these agents of change can be modelled as 
transformations on the states of each subsystem. Let x' denote 
the resulting state from the operation of a transformation 
function Ti. Now Ti may, in effect, be a function of not just 
the state of entity i, i.e., x i7 but also the states of neighboring 
subsystems. Thus, Ti(xi) = /(. . . Xj, x i+ i . . .) where 
the states of neighboring subsystems are explicitly denoted 
and / is a vector-valued function. Subsequent states, those 
resulting from the action of the agents of change can be related 
to the current states by 

X- = /(• . .Xi-i.X^Xi+i . ..) = /(Xi) 

for some / G F{ that is functionally dependent on the states of 
the neighbors of i. Because of our first and second formalisms, 
these transformations must restrict the output states to the 
set of Pareto optima P or near Pareto optima P', i.e., x' 6 
S = P U P'. In effect, the neighboring subsystems select 
transformations from a family Fi of transformations all of 
which map one point in S onto another point in S. Assuming 
the existence of identity and inverse transformations for each 
such point, the set Fi constitutes a symmetric group ([19]) 
over the states S. 

2) Cellular Automata: Complexity theory, in particular 
theories regarding cellular automata (CA) share a number 
of features with SI (see [6, p. 245]). CAs are characterized 
by a set of cells usually arranged in some geometric pattern 
(although this is not technically required) and connected to 
other similar cells by some neighborhood rule. Each cell has 
a finite number of states which are determined by the states 
of its immediate neighbors. State transitions are defined by 
simple rules, a central component of cellular automata, that 
produce complex patterns often described as self-organized 
[36], [3]. The distinctions between SI and CA as proposed 
here involve the connection between the emergent behavior 
and Pareto optimality. CAs usually do not involve objective 
functions per se. It is also worth noting that there is a deep 
connection between CA and group theory [3, Ch. 3: Group 
CA Characterization]. Chaudhuri, et al. [3] provide a number 
of theoretical results on the connection between CA and cyclic 
groups and other group properties. 
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It is also worth noting that the transformation functions 
that determine how each cell evolves are usually very sim- 
ple functions — somewhat analogous to the limited forms of 
communication associated with SI and stigmergy. But in SI, 
the relationships among the entities is dynamic owing to the 
mobility of the agents of change. Entity relationships are 
not fixed as in CAs. Notwithstanding this distinction, SFSMs 
described above can model this mobility in a way consistent 
with a CA paradigm by incorporating system states that model 
changing environmental conditions. 

3) Multi-Objective Optimization: One important aspect in 
making a connection between self-organization and Pareto 
optima is how these Pareto optima are identified. By creatively 
utilizing the Lebesgue measure, SA and other heuristics can be 
engineered to search for these operating points in a very direct 
way, by maximizing this Lebesgue measure. This can be done 
using the COSA concept which happens to incorporate simple 
signalling systems as well. Many useful analogies to SI thus 
become apparent. Swarms of insects are analogous to parallel 
processors, simple signals in SI are analogous to the scalar 
Lebesgue measure or some estimate of it, maximization of this 
Lebesgue measure is analogous to movement along a Pareto 
optimal frontier, hence a type of self-organized behavior. This 
also suggests an interesting, and perhaps profound, unity of 
concepts between and among the various paradigms described 
here. 

4) Game Theory: The dynamics of insect colonies involve 
competition and cooperation among them and so unavoidably 
involve aspects of game theory. The dynamics of competi- 
tion and cooperation constitute the major components of the 
more interesting aspects of game theory [40]. Game theoretic 
concepts such as the famous Nash Equilibrium are really 
statements about Pareto optimality and how interacting entities 
achieve the greatest utility in a changing environment of 
competition and cooperation [41, p. 193,339]. 

5) Markov Random Fields: The theory of Markov Random 
Fields (MRFs) shares a great deal in common with that of 
Boltzmann Machines, SA and CA. As alluded to earlier in the 
paragraphs on Efficient Tradeoffs ...), some subsystem states 
may not be allowed to coexist with neighboring subsystem 
states. MRFs can model this type of behavior and have the 
desirable Markov property in terms of spatial attributes. This 
provides a mathematical structure to these problems. A large 
body of useful theoretical results on MRFs exists (se e.g., [42]) 
and may be quite useful in addressing some of the issues in 
SI and how efficient tradeoffs of efficiency measures may be 
analyzed. 

IV. Conclusion 

The growth of communications and networked systems 
imposes on a us a need to explore new and imaginative 
ways to address the expected problems in managing these 
systems. Effectively utilizing the SI paradigm requires some 
frameworks on which to first build the relevant theory and then 
guide research toward practical implementations. This article 
examined three aspects of a meta-formalism that attempt to 
provide this necessary structure to further research into SI and 
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develop a more rigorous and mathematically sound theory for 
its analysis. The three elements of the meta-formalism involve 
1) a set of first principles based on the laws of nature, 2) 
a dynamical framework, and 3) a problem framework. The 
relevant laws of nature are those associated with the theory 
of evolution and the effects of natural selection. The theory 
of evolution provides the justification for describing efficient 
behavior in social insect colonies. This efficient behavior can 
be mathematically characterized and associated with Pareto 
optima. Thus, the self-organized behavior often associated 
with SI can be given a more mathematical treatment when 
described in terms of operating points that lie along a Pareto 
optimal frontier. Operation along this frontier also provides a 
way of mathematically describing adaptive behavior, one of the 
hallmarks of SI. These notions of efficiency and adaptability 
which are implications of the first principles can be given 
effect as the underlying concepts behind the second formalism 
component — the dynamical framework. 

The dynamical framework was described in terms of scale 
invariant Pareto optimality (SIPO), a property that requires that 
the behavior of entities within a swarm must be consistent with 
efficient behavior and tradeoffs of efficiency on many scales. 
SIPO thus provides a valuable set of conceptual constraints 
that focusses the research efforts, provides ideas that can be 
mathematically formalized and expressed, and features some 
of the mathematical elegance of self-similarity so often found 
in natural systems. 

Finally, the problem framework provides a recipe for ab- 
stracting the problem and, in effect, provides the modelling 
clay with which to mold SI research so that it is consistent with 
the first two formalism components. The problem framework 
employs the use of swarming finite state machines models 
which provide a way to model a continuum of complexity 
apparent in the SI exhibited by different species. 

In conclusion, it is the author's hope that this meta- 
formalism and the ideas in this article will help facilitate 
research and development using SI. The resulting theories have 
potential for application in many areas of inquiry involving 
complex systems. The goal of making these increasingly 
complex systems more efficient, scalable and autonomous will 
help to ensure their effectiveness well into the future. 
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